library(tidyverse)
library(readxl)
path = "Excel/700-799/786/786 Sum of Items.xlsx"
input = read_excel(path, range = "A2:A9")
test = read_excel(path, range = "B2:C7")
result = input %>%
separate_longer_delim(col = "Data", delim = " / ") %>%
mutate(Quantity = str_extract(Data, "\\d+") %>% trimws(),
Item = str_extract(Data, "\\D+") %>% trimws()) %>%
mutate(Items = ifelse(str_detect(Item, "s$"), str_replace(Item, "s$", ""), Item)) %>%
summarise(Total = sum(as.numeric(Quantity), na.rm = TRUE), .by = Items)
all.equal(result, test)
# > [1] TRUEExcel BI - Excel Challenge 786
excel-challenges
excel-formulas
🔰 Find the sum of various items where number may be before or after the item names.

Challenge Description
🔰 Find the sum of various items where number may be before or after the item names.
Solutions
- Logic: Read the workbook ranges needed for the challenge; Derive the required intermediate columns; Parse the packed text or string structure; Aggregate or rank the data at the required grouping level.
- Strengths: The solution stays close to the text pattern itself, which makes the extraction logic easy to audit.
- Areas for Improvement: The solution assumes the workbook layout and selected ranges remain stable, so any structural change in the sheet would require small adjustments.
- Gem: A small number of well-targeted text patterns does most of the heavy lifting.
import pandas as pd
import re
path = "700-799/786/786 Sum of Items.xlsx"
input = pd.read_excel(path, usecols="A", skiprows=1, nrows=8)
test = pd.read_excel(path, usecols="B:C", skiprows=1, nrows=5).sort_values(by='Items').reset_index(drop=True)
input_long = input['Data'].dropna().str.split(' / ').explode().reset_index(drop=True)
df = input_long.str.extract(
r'(?:(?P<Quantity>\d+)\s*(?P<Item>[A-Za-z]+)|(?P<Item2>[A-Za-z]+)\s*(?P<Quantity2>\d+))'
)
df['Item'] = df['Item'].combine_first(df['Item2'])
df['Quantity'] = df['Quantity'].combine_first(df['Quantity2'])
df = df[['Quantity', 'Item']]
df['Items'] = df['Item'].str.rstrip('s').fillna('')
df['Quantity'] = pd.to_numeric(df['Quantity'], errors='coerce')
result = (
df.groupby('Items', as_index=False)['Quantity']
.sum()
.rename(columns={'Quantity': 'Total'})
.sort_values(by='Items')
.reset_index(drop=True)
)
print(result.equals(test)) # TrueThe Python version follows the same grouped logic and keeps the transformation explicit in a dataframe pipeline.
Difficulty Level
Medium
The individual steps are manageable, but the correct transformation pattern is not obvious from the raw data.